Abstract

The objective of this research was to develop algorithms that can be embedded in a hierarchic
coordination and control architecture for teams of multiple UAVs. This resulted in several algorithms
that use mixed-integer linear programming (MILP) to perform the activity and path planning
components of the team coordination problem. Research on this project focused on implementing these
approaches using a receding planning horizon to improve the computational tractability and on
increasing the robustness of the techniques to uncertainty in the situational awareness. We have
also completed a multi-UAV testbed that will be used to evaluate various distributed and hierarchic
control architectures.

Main  Accomplishments 

The  following  lists  the  main  accomplishments  of  the  project: 

• Developed a new receding horizon formulation of the task assignment (RHTA) problem using
the decomposition approach [1, 2]. The RHTA selects multiple tasks for each UAV during
each iteration of the design, which enables greater coordination within the team and can
result in much better performance than iterative greedy assignment techniques. This faster
task assignment algorithm forms the core of the hierarchic coordination architecture using
“dynamic sub-teams”.

• Modified the MILP trajectory design algorithm to: (i) execute as a model predictive
controller; (ii) account for external disturbances (e.g., impact of wind on the UAVs); and (iii)
use improved linearized models of the UAV dynamics. Validated the trajectory design using
a team of three rovers [3] and a hardware-in-the-loop simulation of five UAVs [4, 5].

•  Extended  the  cooperative  path  planning  algorithm  (RH-MILP)  to  3D  [6,  7].  Modified  the 
formulation  to  include  models  of  the  environment  risk  in  the  cost-to-go,  glue,  and  detailed 
paths.  Developed  a  new  pruning  technique  that  significantly  reduces  the  computation  time 
of  the  receding  horizon  algorithm.  This  approach  is  faster,  but  it  still  retains  the  freedom  to 
choose  between  multiple  future  paths  and  has  been  shown  to  work  well  in  practice  [8,  9, 10,  11]. 

• Developed a novel approach to the decentralized collision avoidance problem for multiple
UAVs using our new robust model predictive controller [3, 12]. This decentralized Model
Predictive Controller (DMPC) algorithm guarantees robust satisfaction of coupling constraints
and offers a significant computation improvement over a centralized approach. The key point
is that, while the vehicles are assumed to communicate, the solution process does not iterate,
so it scales well with the fleet size [13, 14].

•  The  task  assignment  algorithms  have  been  extended  to  add  robustness  to  uncertainty  in  the 
situational  awareness.  The  receding  horizon  task  assignment  (RHTA)  has  been  extended 
to  solve  problems  with  coupled  reconnaissance  and  strike  objectives  [22,  15].  We  have  also 
developed  a  new  Filter-embedded  Task  Assignment  (FETA)  algorithm  that  gives  a  formal 
method  of  reducing  the  impact  of  disturbances  or  uncertainty  in  the  cost  estimates  in  the 
on-line  task  assignment  [2,  16]. 

• Completed the design of the DURIP-funded multi-rover and multi-UAV testbeds and
performed initial flight tests of the path planning algorithms on the UAVs [4, 5, 9, 10, 11, 17].

Algorithm  Details 

With  many  vehicles,  obstacles,  and  targets,  the  coordination  of  a  fleet  of  Unmanned  Aerial  Vehicles 
(UAVs)  is  a  very  complicated  optimization  problem,  and  the  computation  time  typically  increases 
very  rapidly  with  the  problem  size.  Previous  research  proposed  an  approach  to  decompose  this  large 
problem  into  task  assignment  and  trajectory  design  problems,  while  capturing  key  features  of  the 
coupling  between  them.  This  enabled  the  control  architecture  to  solve  an  assignment  problem  first 
to  determine  a  sequence  of  waypoints  for  each  vehicle  to  visit,  and  then  concentrate  on  designing 
paths to visit these pre-assigned waypoints. Refs. [2, 5] discuss the extension of that approach to
the  Receding  Horizon  Task  Assignment  (RHTA)  algorithm.  RHTA  was  modified  further  so  that  it 
can  be  executed  in  real-time  when  the  situational  awareness  is  changing  rapidly.  The  calculation 
was  sped  up  by  using  Concert  Technology™  by  ILOG  [18]  to  avoid  the  slow  process  of  transferring 
data  between  different  parts  of  the  solution  algorithm  and  by  using  an  incremental  algorithm  to 
generate  updates  to  the  cost  map  as  the  knowledge  of  the  environment  changes. 
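
As a rough illustration of the receding horizon idea (not the MILP formulation of Refs. [1, 2]), the sketch below enumerates short task sequences for each vehicle, selects the best conflict-free combination, commits only the first task of each selected sequence, and then repeats. The function names, the brute-force search standing in for the MILP, the unit vehicle speed, and the time-discount factor `lam` are all assumptions made for illustration.

```python
from itertools import permutations, product
from math import dist


def petals(start, t0, tasks, horizon, lam):
    """List (sequence, score) pairs for every ordered task sequence of
    length 0..horizon, scoring each task by lam ** arrival_time."""
    out = [((), 0.0)]  # empty sequence: the vehicle takes no task
    for n in range(1, min(horizon, len(tasks)) + 1):
        for seq in permutations(tasks, n):
            t, here, score = t0, start, 0.0
            for name in seq:
                t += dist(here, tasks[name])  # unit speed: time == distance
                here = tasks[name]
                score += lam ** t
            out.append((seq, score))
    return out


def rhta(vehicles, tasks, horizon=2, lam=0.9):
    """Repeatedly pick the best conflict-free set of short sequences,
    then commit only the first task of each selected sequence."""
    pos = dict(vehicles)
    clock = {v: 0.0 for v in pos}
    remaining = dict(tasks)
    plan = {v: [] for v in pos}
    names = list(pos)
    while remaining:
        options = [petals(pos[v], clock[v], remaining, horizon, lam)
                   for v in names]
        best, best_score = None, -1.0
        for combo in product(*options):
            used = [t for seq, _ in combo for t in seq]
            if len(used) != len(set(used)):
                continue  # two vehicles claimed the same task
            score = sum(s for _, s in combo)
            if score > best_score:
                best, best_score = combo, score
        for v, (seq, _) in zip(names, best):
            if seq:
                first = seq[0]
                clock[v] += dist(pos[v], remaining[first])
                pos[v] = remaining.pop(first)
                plan[v].append(first)
    return plan
```

Looking `horizon` tasks ahead while committing only one task per iteration is what separates this scheme from a purely greedy assignment; the exhaustive enumeration is only practical for very small examples.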

Task  Assignment  Algorithms:  Work  on  this  project  also  investigated  the  role  of  uncertainty  in 
task  assignment  algorithms,  leading  to  robust  techniques  that  mitigate  the  effects  on  the  command 
and  control  decisions.  This  uncertainty  could  result  from  inherent  sensing  errors,  incorrect  prior 
information,  loss  of  communication  with  teammates,  or  adversarial  deception.  Our  analysis  showed 
that  there  are  very  close  similarities  between  the  various  robust  optimization  methods  that  have 
recently  been  proposed  (including  techniques  based  on  interval  uncertainty  models  [19]  and  the 
CVaR  approach  [20]),  suggesting  that  comparable  levels  of  robustness  and  performance  could  be 
achieved using a very simple algorithm [21]. With this insight, we developed a new version of the
robust  task  assignment  that  is  computationally  tractable  and  yields  levels  of  robustness  that  are 
similar  to  the  more  sophisticated  algorithms  that  are  not  suitable  for  real-time  applications  [15,  19]. 
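
One simple way to express this kind of robustness is to discount each expected score by a multiple of its uncertainty before assigning. The sketch below is an illustration only: the scoring rule (mean minus `mu` times standard deviation), the function names, and the example numbers are assumptions, and it is not claimed to be the exact algorithm of Ref. [21].

```python
def robust_scores(mean, sigma, mu=1.0):
    """Penalize each expected score by mu times its standard deviation."""
    return {k: mean[k] - mu * sigma[k] for k in mean}


def pick_targets(mean, sigma, n, mu=1.0):
    """Return the n targets with the highest robust score, best first."""
    scores = robust_scores(mean, sigma, mu)
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical data: T1 looks best on average but is very uncertain.
mean = {"T1": 10.0, "T2": 9.0, "T3": 5.0}
sigma = {"T1": 6.0, "T2": 1.0, "T3": 0.5}
nominal_choice = pick_targets(mean, sigma, 1, mu=0.0)  # ignores uncertainty
robust_choice = pick_targets(mean, sigma, 1, mu=1.0)   # hedges against it
```

With `mu = 0` the rule reduces to the nominal assignment; larger values of `mu` trade expected score for protection against poorly known targets.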

RHTA  was  also  extended  to  include  reconnaissance  tasks  that  can  be  added  to  a  mission  to  reduce 
the  uncertainty  in  the  environment.  The  optimal  strike/reconnaissance  mission,  which  explicitly 
captures  the  coupling  between  performing  reconnaissance  tasks  and  reducing  the  uncertainty  in  the 
associated  strike  tasks,  is  nonlinear,  but  with  a  change  of  variables  we  showed  that  it  can  be  solved 
as a MILP [15, 22].

We also developed a modified formulation of the task assignment that can be used to tailor the
control system to mitigate the effect of noise in the situational awareness (SA) on the solution [16].
The approach taken here is to perform the reassignments at the rate the information is updated,
which enables the planner to react immediately to any significant changes that occur in the
environment. Also, rather than just limiting the rate of change of the plan, this new approach embeds
a more sophisticated filtering operation in the task assignment algorithm. We have shown that
this modified formulation can be interpreted as a noise rejection algorithm that reduces the effect
of the high frequency noise on the planner. A key feature of this filter-embedded task assignment
algorithm is that the coefficients of the filter are tuned online using the past information. Fig. 1
shows that adding our filtering tends to increase the correlation between plans from one time-step
to the next, which decreases the variation in the plans. This means that the task assignment
is returning the same solution even though the data in the problem is changing slightly due to
noise/disturbances/uncertainty in the cost estimates. The unfiltered results show lower
correlation, which means the plans are changing and the vehicles would be re-assigned to new tasks. Each
plan might be optimal at that time-step, but this can lead to a “churning” type of behavior wherein
the vehicles flip back and forth between assignments [23].

Fig. 1: Comparison of the normalized plan correlation over time with and without filtering. The higher
correlation of the new algorithm shows that the plans change much less dramatically as a result of
changes in the information available to the planner.
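
The churning effect and its suppression can be illustrated with a much simpler stand-in for the filter: a fixed penalty on departing from the previous assignment. The sketch below is an assumption-laden illustration (a single vehicle, a constant `switch_penalty`, hypothetical score data) and does not reproduce the online-tuned filter coefficients of the FETA formulation in Ref. [16].

```python
def assign(scores, previous=None, switch_penalty=0.0):
    """Pick the best task, charging a penalty for leaving the previous one."""
    def value(task):
        changed = previous is not None and task != previous
        return scores[task] - (switch_penalty if changed else 0.0)
    return max(scores, key=value)


def run(score_history, switch_penalty):
    """Re-assign at every information update and record the chosen task."""
    plan, previous = [], None
    for scores in score_history:
        previous = assign(scores, previous, switch_penalty)
        plan.append(previous)
    return plan


# Hypothetical noisy score estimates for two tasks over four updates.
history = [
    {"A": 5.0, "B": 4.8},
    {"A": 4.7, "B": 4.9},
    {"A": 5.1, "B": 4.8},
    {"A": 4.8, "B": 5.0},
]
unfiltered = run(history, switch_penalty=0.0)  # flips between A and B
filtered = run(history, switch_penalty=0.5)    # stays on A
```

A genuinely large change in the scores still overrides the penalty, so the planner remains able to react to significant changes in the environment.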

We  have  also  addressed  the  problem  of  weapon  target  assignment  in  a  risky  environment  [24]. 
Two  formulations  were  developed.  The  first  is  simple  to  solve,  but  the  objective  function  ignores 
the  effect  that  the  tasks  performed  by  some  of  the  weapons  can  have  on  the  risk/performance  of 
the  other  weapons.  The  resulting  targeting  process  is  shown  to  be  coordinated,  but  because  it 
ignores  this  interaction,  it  is  what  we  call  non-cooperative.  The  second  formulation  accounts  for 
this  interaction  and  solves  for  the  optimal  cooperative  strategy  using  Dynamic  Programming.  Two 
approximation  methods  were  investigated  for  these  cooperative  problems,  and  these  are  shown  to 
achieve  near-optimal  solutions  with  computation  times  that  are  suitable  for  on-line  implementation. 
The  results  from  numerous  simulations  clearly  show  the  benefits  of  cooperative  strategies  over  just 
coordinated ones [24].

MILP  for  Path  Planning:  References  [6,  7,  8,  25]  outline  our  path  planning  approach  which  uses 
MILP  to  compute  a  short,  detailed  trajectory  around  obstacles,  no-fly-zones,  and  other  vehicles 
using  an  estimate  of  the  cost-to-go  from  a  shortest  path  algorithm.  The  research  in  this  project 
extended  this  receding  horizon  approach  (called  RH-MILP)  in  several  ways: 

• Developed a new formulation of RH-MILP for 3D paths [7]. The approach is similar to our
previous 2D algorithms that construct a coarse cost map to provide approximate paths from a
sparse set of nodes to the goal and then use MILP optimization to design the detailed part of
the trajectory. The cost map calculation was modified to account for possible vertical vehicle
maneuvers [7].

•  Embedded  a  new  pruning  algorithm  in  RH-MILP  to  significantly  reduce  the  computation 
time  [6].  The  approach  is  much  faster,  but  it  still  retains  the  flexibility  to  choose  better  paths 
around  obstacles,  and  has  been  shown  to  work  well  in  practice  [4,  9,  11]. 

• Included environmental uncertainty/risk in the RH-MILP cost-to-go. Developed a new
algorithm for approximately solving robust shortest path problems (called ARSP) that yields
levels of performance that are comparable to previously published algorithms, but is
significantly faster (only approximately 2.5 times the computational effort to solve the nominal
problem) [17].
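
The role of the coarse cost map can be sketched as follows: a shortest path algorithm run backwards from the goal gives a cost-to-go for each node, and the planner then heads for the visible node that minimizes the distance to that node plus its cost-to-go. This is a minimal sketch under stated assumptions: the graph, the node names, and the `best_node` selection rule are hypothetical, Dijkstra's algorithm stands in for whichever shortest path method was used, and the MILP that designs the detailed trajectory is omitted.

```python
import heapq
from math import dist


def cost_to_go(nodes, edges, goal):
    """Dijkstra from the goal over an undirected graph of straight edges."""
    adj = {n: [] for n in nodes}
    for a, b in edges:
        w = dist(nodes[a], nodes[b])
        adj[a].append((b, w))
        adj[b].append((a, w))
    cost = {n: float("inf") for n in nodes}
    cost[goal] = 0.0
    heap = [(0.0, goal)]
    while heap:
        c, n = heapq.heappop(heap)
        if c > cost[n]:
            continue  # stale heap entry
        for m, w in adj[n]:
            if c + w < cost[m]:
                cost[m] = c + w
                heapq.heappush(heap, (c + w, m))
    return cost


def best_node(position, visible, nodes, cost):
    """Choose the visible node minimizing distance-to-node + cost-to-go."""
    return min(visible, key=lambda n: dist(position, nodes[n]) + cost[n])


# Hypothetical map: two obstacle corners, each with a clear line to the goal.
nodes = {"goal": (10, 0), "n1": (5, 5), "n2": (5, -2)}
edges = [("n1", "goal"), ("n2", "goal")]
cost = cost_to_go(nodes, edges, "goal")
choice = best_node((0, 0), ["n1", "n2"], nodes, cost)
```

Because the cost map is computed once on a sparse set of nodes, the expensive optimization only has to cover the short detailed segment of the path.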


Fig.  2:  Three  dimensional  trajectory  in  a  complicated  environment  with  risks  using  a  mid-level 
weighting  on  altitude. 

Fig.  2  shows  an  example  scenario  for  the  3D  RH-MILP.  With  a  low  penalty  on  altitude,  the 
vehicle  just  flies  over  all  of  the  obstacles  so  that  the  resulting  trajectory  is  effectively  a  straight  line 
connecting  the  start  and  goal.  With  a  very  large  altitude  penalty,  the  vehicle  avoids  climbing  over 
any  of  the  obstacles  and  simply  flies  around  them  at  a  very  low  altitude  —  the  2D  solution.  Fig.  2 
shows  a  trajectory  with  medium  penalty,  for  which  the  vehicle  flies  around  the  larger  obstacles,  but 
decides to fly over the first story of the two-story obstacle near the start of the trajectory (which
is  directly  in  the  way),  skirting  around  the  outside  of  the  second  story. 

Model Predictive Control: Receding horizon control is often referred to as Model Predictive Control
(MPC), and our other research has developed MPC formulations in a more general setting, with
applications to the RH-MILP problem. In particular, we have developed a new robust MPC (RMPC)
approach that uses constraint tightening [26] with a more general candidate policy, thereby leading
to a less constrained optimization and hence a less conservative controller [12, 13]. The approach
retains “margin” for future feedback action, which becomes available to the MPC optimization as
time progresses. Since robustness follows only from the constraint modifications, only nominal
predictions are required, avoiding both the large growth in problem size associated with incorporating
multivariable uncertainty in the prediction model and the conservatism associated with worst case
cost predictions, a common alternative.


Fig. 3 shows position time histories for 100 simulations of a double integrator system using nominal
MPC. The position constraint is shown dashed. The control was constrained to have unit magnitude
or less, and a random disturbance of up to 20% of the control was included. Each o marks the point
at which a simulation ended because the problem became infeasible. Fig. 4 shows the same results
using robust MPC with constraint tightening. Observe that the position goes right to the constraint
but never crosses it, remaining feasible throughout.
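
The constraint tightening idea can be illustrated on a scalar system x(k+1) = a x(k) + u(k) + w(k) with |w| <= w_max and a state limit x <= x_max. If a candidate linear feedback gives closed-loop dynamics a_cl, the nominal prediction at step k must leave room for the accumulated effect of the first k disturbances. The sketch below computes those tightened limits; the scalar model, the linear candidate policy, and all names are assumptions for illustration, not the double integrator setup or the more general policy of Refs. [12, 13].

```python
def tightened_bounds(x_max, w_max, a_cl, horizon):
    """Upper limits on the nominal state at prediction steps 0..horizon.

    The margin at step k is w_max * sum(|a_cl| ** j for j in range(k)),
    i.e. the worst-case effect of k past disturbances under the
    candidate feedback policy.
    """
    bounds, margin = [], 0.0
    for k in range(horizon + 1):
        bounds.append(x_max - margin)
        margin += abs(a_cl) ** k * w_max
    return bounds


# Hypothetical numbers: 10% disturbance, closed-loop pole at 0.5.
limits = tightened_bounds(x_max=1.0, w_max=0.1, a_cl=0.5, horizon=3)
# A deadbeat candidate policy (a_cl = 0) needs only a one-step margin.
deadbeat = tightened_bounds(x_max=1.0, w_max=0.1, a_cl=0.0, horizon=3)
```

Only the limits change from step to step; the prediction model itself stays nominal, which is why the optimization does not grow with the uncertainty description.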

Decentralized MPC: The same concept has been used to develop a decentralized MPC (DMPC)
algorithm for multiple subsystems with hard, coupled constraints. Multiple UAVs with collision
avoidance constraints form an example of this class of systems [13, 14]. The algorithm scales much
better than a centralized approach as each subsystem has an individual planning optimization
solving only for its own actions. The actions of other subsystems are accounted for by communication,
but feasible solutions are guaranteed and it is not necessary to iterate between subsystems to check
feasibility. The subproblems are solved sequentially, and constraint tightening is employed to ensure
that each subproblem has at least one feasible solution, given a feasible solution to the preceding
subproblem.
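
The sequential structure can be sketched with a toy example in which each vehicle, in turn, plans a single grid move against the most recently communicated positions of the others. Staying in place is always feasible because the previous planner already respected the separation constraint, which mirrors the feasibility argument above. Everything here is an illustrative assumption: the one-step grid moves, the greedy distance-to-goal objective, and the function names replace the multi-step MILP optimizations and constraint tightening of Refs. [13, 14].

```python
from math import dist

MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]


def plan_one(i, published, goals, min_sep):
    """Vehicle i picks its best move that keeps min_sep from the latest
    published positions of all other vehicles; staying put is the fallback."""
    here = published[i]
    others = [p for j, p in enumerate(published) if j != i]
    best = here
    for dx, dy in MOVES:
        cand = (here[0] + dx, here[1] + dy)
        if all(dist(cand, p) >= min_sep for p in others):
            if dist(cand, goals[i]) < dist(best, goals[i]):
                best = cand
    return best


def dmpc_round(published, goals, min_sep):
    """One pass over the fleet: vehicles plan in sequence, no iteration."""
    published = list(published)
    for i in range(len(published)):
        published[i] = plan_one(i, published, goals, min_sep)
    return published
```

Each vehicle solves one small problem per round, so the work per round grows linearly with the fleet size. This greedy toy can stall when paths conflict head-on; it is meant to show only the ordering and the preserved separation.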


To demonstrate the improvement in scalability, DMPC was applied to a multi-UAV collision
avoidance problem. For each fleet size, 50 random instances were solved and compared with
centralized robust MPC on the same problems. Fig. 5 shows a typical scenario. The median solution
times are shown in Fig. 6. Note the different scales on the upper and lower plots, and that the
decentralized solution times are broken down by subproblem but shown stacked, as they are solved
sequentially. For 5 vehicles, computation time was improved by a factor of 20 or more [14].

The DMPC algorithm was also extended to explicitly account for delays in the system, arising from
both the computation of each control optimization and the communication between vehicles. The
algorithm was demonstrated in hardware, using wheeled robot vehicles (Fig. 7) to emulate UAVs. MILP
optimization was used in real-time within the DMPC algorithm to solve the nonconvex trajectory
optimizations. Fig. 8 shows trajectories from experiments using three rovers. In the first figure, the
target boxes are at the bottom right, and rover 1 must change its path significantly to avoid
collisions. The last two plots show a different scenario, in which rovers 1 and 3 must swap positions
and rover 2 crosses both their paths. In these cases, all the rovers take indirect paths to avoid
collisions. These experimental results confirmed that the modified algorithm can operate successfully
in the presence of realistic computation and communication delays.

Fig. 7: Three rover experimental setup.


Fig.  8:  DMPC  Results  for  Three  Rovers.  Numbers  mark  the  starting  points  of  each  rover  and  the 
target  regions  are  shown  by  boxes. 


UAV  Testbed  Demonstrations 

The  UAV  testbed  shown  in  Fig.  9  was  developed  to  validate  and  evaluate  the  coordination  and 
control  approaches  [5,  10].  This  work  was  motivated  by  the  observation  that  a  key  step  towards 
transitioning  these  high-level  algorithms  to  future  missions  will  be  to  successfully  demonstrate  that 
they  can  handle  similar  challenges  on  scaled  vehicles  operating  in  realistic  environments.  A  wireless 
video  system  was  integrated  with  the  UAV  testbed  to  produce  high  quality  images  from  the  airborne 
vehicles.  Fig.  10  shows  a  typical  aerial  shot  from  one  of  the  UAVs.  This  system  is  used  to  track 
stationary  and  moving  objects  on  the  ground  and  provide  feedback  to  the  operator.  The  status  at 
the  end  of  the  project  was: 

• UAV testbed has been operated autonomously on numerous (> 40) occasions [4]. Fig. 11
shows the results of a 22 min. autonomous flight involving two UAVs simultaneously flying
the same flight plan. Both vehicles tracked the waypoints in the presence of wind, and open
loop formation flight was achieved by adjusting the commanded speed until the vehicles were
in phase with one another. A 50m vertical offset was applied to the data to allow for easier
viewing.

Fig. 9: UAV testbed with 8 identical aircraft.

Fig. 10: Image from onboard video.

Fig. 11: Data from 2 UAVs on same plan. 50m offset applied for easier viewing.

• Implemented RH-MILP on the UAVs [4, 10, 11]. The results were successful, but they
highlighted the need to account for the effect of wind disturbances on the entire planning system.
This also requires that the plans be robust to flight time uncertainty and that the planner
can rapidly adapt to variations in the execution.

•  Developed  a  flexible  GUI  for  designing  mission  scenarios,  which  is  a  challenging  problem  when 
there  are  many  vehicles  and  targets  and  the  environment  is  dynamic.  The  interface  can  be 
used to lay out the scenario prior to the mission. It can also be used during the mission to
provide  the  operator  with  a  visualization  of  the  current  plan,  enabling  them  to  interact  with 
the  optimization  algorithms  [17]. 

Transitions

There  have  been  several  key  transitions  of  the  technology  as  part  of  this  program: 


•  Key  interactions  with  Robert  Miller  at  Northrop  Grumman  (Oct  2003-present). 

•  Working  with  Jerry  Wohletz,  Kathleen  Misovec,  and  Jorge  Tierno  at  AlphaTech  (now  BAE) 
from  Jan  2004  -  June  2005  on  an  STTR  (phase-1). 

• Our MILP path planning algorithms were successfully demonstrated on the Boeing OCP
platform as part of the DARPA SEC program [27, 28].

• Dr. A. Richards (former student) is now a Lecturer in Controls and Dynamics, Dept. of
Aerospace Engineering, University of Bristol.